## Using rownames(importance_df) as id variables
The endpoint data set includes data from Days 2, 3, and 10, so it might be misleading: it combines CFU from different time points.
RF, all mice/days together. RF, CFU vs. community by day.
## Using rownames(importance_df_day) as id variables
## Using rownames(importance_persist) as id variables
## Using rownames(importance_persist_day) as id variables
Eliminating mice euthanized early from the RF model gives similar R^2 and MSE, but the Day 0 community OTU features show a slight advantage in predicting Day 1 CFU. In addition, the R^2 value increases with increasing numbers of features, whereas when all mice are used the Day 1 R^2 only decreases with increasing features. Of note, this is accompanied by an increase in the % MSE attributed to OTU15 (Akkermansia), which has stood out in all other days/analyses. Interestingly, Akkermansia does not appear to have the same relationship when CFU and community from the same day are compared. This could suggest that Akkermansia promotes the initial colonization of C. difficile. At first glance, OTU135 (Coriobacteriaceae) seems to stand out for predicting CFU both on the same day and from Day 0.
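The kind of comparison described above can be sketched roughly as follows. This is only an illustration, assuming the models were fit with the `randomForest` package (the fitting code is not shown here). The data are simulated, and the names `otu_day0`, `cfu_day1`, `Otu015`, `Otu135`, and `Otu001` are hypothetical stand-ins for the real Day 0 OTU table and Day 1 CFU values.

```r
# Simulated stand-in data: NOT the real mouse data.
set.seed(1)
n_mice <- 60

# Hypothetical Day 0 relative abundances for three OTUs
otu_day0 <- data.frame(
  Otu015 = runif(n_mice),
  Otu135 = runif(n_mice),
  Otu001 = runif(n_mice)
)

# Hypothetical Day 1 log10 CFU, driven by Otu015 plus noise
cfu_day1 <- 4 + 3 * otu_day0$Otu015 + rnorm(n_mice, sd = 0.5)

rf_imp <- NULL
if (requireNamespace("randomForest", quietly = TRUE)) {
  # Regression forest: Day 0 community features predicting Day 1 CFU
  rf <- randomForest::randomForest(
    x = otu_day0, y = cfu_day1,
    ntree = 500, importance = TRUE
  )

  # type = 1 returns the permutation importance (%IncMSE)
  rf_imp <- randomForest::importance(rf, type = 1)
  print(rf_imp[order(rf_imp[, 1], decreasing = TRUE), , drop = FALSE])

  # OOB MSE and pseudo R^2 after the final tree
  cat("MSE:", tail(rf$mse, 1), " R^2:", tail(rf$rsq, 1), "\n")
}
```

Refitting the same model on a subset of mice (for example, after dropping those euthanized early) and comparing the `%IncMSE` rankings is one way to check whether a feature such as OTU15 gains or loses importance.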
##
## Call:
## roc.default(response = as.numeric(Predict_early_euth_df$Euth_Early), predictor = as.numeric(rf_early_euth$predicted))
##
## Data: as.numeric(rf_early_euth$predicted) in 39 controls (as.numeric(Predict_early_euth_df$Euth_Early) 1) < 16 cases (as.numeric(Predict_early_euth_df$Euth_Early) 2).
## Area under the curve: 0.7812
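The `roc()` call above uses `rf_early_euth$predicted` as the predictor. Assuming that vector holds the hard out-of-bag class labels, which is consistent with the confusion matrix reported in this section, the ROC curve has a single operating point and the AUC reduces to the mean of sensitivity and specificity. The base-R sketch below rebuilds the labels from the reported counts (39 controls, 16 cases, 9 cases predicted correctly) and reproduces the value.

```r
# Labels rebuilt from the reported counts; 1 = not euthanized early, 2 = euthanized early
truth <- c(rep(1, 39), rep(2, 16))
pred  <- c(rep(1, 39),            # all 39 controls predicted as class 1
           rep(1, 7), rep(2, 9))  # 7 cases missed, 9 cases predicted as class 2

cases    <- pred[truth == 2]
controls <- pred[truth == 1]

# AUC as the probability that a case scores higher than a control,
# counting ties as one half (Mann-Whitney formulation)
diffs <- outer(cases, controls, "-")
auc <- mean((diffs > 0) + 0.5 * (diffs == 0))
print(auc)  # 0.78125

# Same value from sensitivity and specificity
sensitivity <- 9 / 16
specificity <- 39 / 39
print((sensitivity + specificity) / 2)  # 0.78125
```

If class probabilities were wanted instead, the vote fractions stored in a `randomForest` classification object (`$votes`) could be passed as the predictor, which would give a curve with more than one threshold. That is a suggestion, not something done in this analysis.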
Confusion Matrix
| Observed | Predicted: No | Predicted: Yes | class.error |
|---|---|---|---|
| No | 39 | 0 | 0.0000 |
| Yes | 7 | 9 | 0.4375 |
OOB error rate = 12.73% (7 of 55 mice misclassified)
Mice that were euthanized early but that the model predicted would not be (Mouse Tag - Cage): 2084 IN1, 2091 OP, 2096 OUT, 2542 OP, 382 578, 389 578, NT INA
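The per-class errors and the OOB error rate follow directly from the confusion matrix counts. The short base-R check below recomputes them; the object names are illustrative only.

```r
# Confusion matrix counts as reported: rows = observed, columns = predicted
conf <- matrix(
  c(39, 0,
     7, 9),
  nrow = 2, byrow = TRUE,
  dimnames = list(observed = c("No", "Yes"), predicted = c("No", "Yes"))
)

# Per-class error: fraction of each observed class that was misclassified
class_error <- 1 - diag(conf) / rowSums(conf)
print(class_error)  # No: 0, Yes: 0.4375

# Overall OOB error: misclassified mice over all mice (7 / 55)
oob_error <- 1 - sum(diag(conf)) / sum(conf)
print(round(100 * oob_error, 2))  # 12.73
```

All seven errors fall in the "Yes" row, so the model never flags a mouse that survived, but it misses 7 of the 16 mice that were euthanized early.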